An empirical study of datapath, memory hierarchy, and network in SIMD array architectures

Authors

  • Martin C. Herbordt
  • Charles C. Weems
Abstract

Although SIMD arrays have been built for 30 years, they have as a class been the subject of few empirical design studies. Using ENPASSANT, a simulation environment developed for that purpose, we analyze several aspects of SIMD array architecture with respect to a test suite of spatially mapped applications. Several surprising results are obtained. With respect to memory hierarchy, we find that adding a level of cache to current PE designs is likely to be advantageous, but that such a cache will look quite different than expected. In particular, we find that associativity has unusual significance and that performance varies inversely with block size. Router network results indicate the importance of support for local transfers, broadcast, and reduction, even at the expense of arbitrary permutations. Other communication results point to the appropriate dimensionality of k-ary n-cube networks (2 or 3) and the criticality of supporting bidirectional transfers, even if the overall bandwidth remains unchanged.


Similar resources

Using Emulations to Enhance the Performance of Parallel Architectures

We illustrate the potential of techniques and results from the theory of network emulations to enhance the performance of a parallel architecture. The vehicle for this demonstration is a suite of algorithms that endow an N-processor bit-serial processor array A with a "meta-instruction" GAUGE k, which (logically) reconfigures A into an N/k-processor virtual machine Bk that has: 1) a datapath a...


Flexible Parallel Processing in Memory: Architecture + Programming Model

VLSI technology continues to develop at a staggering rate, presenting two challenges to computer designers: (i) how to capitalize on the additional resources that are available on a chip; and (ii) how to evolve computer architecture models that are well matched to the significantly changed physical parameters of new technology and the expanding needs of applications. One of the chief challenges i...


Size Tradeoffs in the Design of SIMD Arrays for a Spatially Mapped Workload

Though massively parallel SIMD arrays continue to be promising for many computer vision applications, they have undergone few systematic empirical studies. The problems include the size of the architecture space, the lack of portability of the test programs, and the inherent complexity of simulating up to hundreds of thousands of processing elements. The latter two issues have been addressed pr...


A Sub-mW H.264 Baseline-Profile Motion Estimation Processor Core with a VLSI-Oriented Block Partitioning Strategy and SIMD/Systolic-Array Architecture

We propose a sub-mW H.264 baseline-profile motion estimation processor for portable video applications. It features a VLSI-oriented block partitioning strategy and a low-power SIMD/systolic-array datapath architecture, where the datapath can be switched between an SIMD and a systolic array depending on processing flow. The processor supports all seven kinds of block modes, and can handle three r...


Pentium III Processor Implementation Tradeoffs

This paper discusses the implementation tradeoffs of the Pentium III processor. The Pentium III processor implements a new extension of the IA-32 instruction set called the Internet Streaming Single-Instruction, Multiple-Data (SIMD) Extensions (Internet SSE). The processor is based on the Pentium Pro processor microarchitecture. The initial development goals for the Pentium III processor were ...




Publication date: 1995